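Course Information
Course title
Information Retrieval (資訊檢索)
Semester
109-1
Offered to
College of Liberal Arts, Department of Library and Information Science
Instructor
唐牧群
Course number
LIS4012
Course ID
106 47000
Section

Credits
3.0
Full/half year
Half year
Required/elective
Required
Class time
Wednesday, periods 6, 7, 8 (13:20~16:20)
Location
圖資視聽室 (Library and Information Science audiovisual room)
Remarks
Enrollment limit: 70 students
Ceiba course website
http://ceiba.ntu.edu.tw/1091LIS4012_
Course introduction video

Core competency links
Map of core competencies and course planning
Course Syllabus
To protect everyone's rights, please respect intellectual property rights and do not make illegal photocopies.
Course Description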

This course provides an introduction to the use, design, and evaluation of information retrieval (IR) systems. It covers the major components of the IR process, such as search strategies, indexing, IR models, and IR evaluation. Students will also gain hands-on experience with IR evaluation and with designing a digital library system. Special attention will be given to the comparison of different indexing methods and IR models and to how they might complement each other.

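Course Objectives
To provide an introduction to the use, design, and evaluation of information retrieval (IR) systems.
Course Requirements
To be announced
Expected weekly study hours outside class

Office Hours

Required Readings
To be announced
References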
Bell, S. S. (2006). Librarian's guide to online searching.
Bhavnani, S. K., K. Drabenstott, D. Radev (2000). Towards a unified framework of IR tasks and strategies.
Manning, Raghavan, Schütze (2008). Introduction to Information Retrieval. Cambridge.
Chowdhury, G. G. (2004). Introduction to modern information retrieval. London: Facet Publishing.
Hersh, W. R. (1996). Information retrieval: a health and biomedical perspective. New York: Springer-Verlag New York, Inc.
Salton & McGill (1983). Introduction to modern information retrieval. McGraw-Hill.
Grossman and Frieder (2004). Information retrieval: algorithms and heuristics.
Belew, Richard K. (2000). Finding out about: a cognitive perspective on search engine technology and the WWW. Cambridge: Cambridge University Press.
O'Connor, B. (1996). Explorations in indexing and abstracting.
Lancaster (2003). Indexing and abstracting in theory and practice.
Evaluation of Web-Based Search Engines Using User-Effort Measures. Available online: http://libres.curtin.edu.au/libres13n2/tang.htm
Ian H. Witten, David Bainbridge (2003). How to Build a Digital Library, Amsterdam: Morgan Kaufmann Publishers.
Jannach, D., M. Zanker, A. Felfernig, G. Friedrich (2011). Recommender systems: an introduction. Cambridge.
Soergel (1985). Organizing information: principles of data base and retrieval systems. San Diego, CA: Academic Press Professional, Inc.
Camtasia: Download; Video tutorial
PubMed
PubMed tutorials, available at http://www.nlm.nih.gov/bsd/disted/pubmed.html
OVID SP tutorial from Yale University Library
SCOPUS tutorial
PubMed help, available at http://www.ncbi.nlm.nih.gov/books/bookres.fcgi/helppubmed/pubmedhelp.pdf
Greenstone: the software can be downloaded at http://www.greenstone.org/cgi-bin/library?e=p-en-home-utfZz-8&a=p&p=download
Greenstone manuals (the "User's guide" is most relevant to our purpose): http://greenstone.sourceforge.net/wiki/index.php/Manual
Grading
(for reference only)

No.
Item
Percentage
Description
1. 
 
0% 
Attendance at all class sessions is mandatory. Your grade will be based on your participation, homework, and in-class assignments. Your group projects will be judged by both the instructor
2. 
Homework and participation 
10% 
There will be one homework on the practice of text tokenization and TFxIDF calculation. 
3. 
Group projects 
0% 
Students will form groups of 3 to 5 to conduct 3 group projects. For each project, in addition to the group report, each group member should prepare a personal report of one to two paragraphs explaining their contributions and what they have learned from the assignment.
4. 
Search feature/command demo 
0% 
Create and present a video demo that explains a search tactic or function available in Ovid/MEDLINE, EBSCO/MEDLINE, Embase, or Scopus (accounts for 10% of your final grade). See example
5. 
Simulated literature search evaluation 
30% 
a. To obtain the search topics, interview two users (preferably graduate students or faculty members in the sciences), each on one research topic they are interested in. Collect from each user a search statement and associated query terms that you both agree best represent the user's information need. Also try to characterize the information need using attributes such as "topic familiarity" and "uncertainty".
b. For each search topic, submit the queries on the user's behalf to two search engines chosen from Google Scholar, Microsoft Academic Search, Semantic Scholar, or other major citation databases (e.g., Scopus, WoS). Collect the first 30 links from each of the two returned sets.
c. Determine the degree of overlap between the two returned sets.
d. Mix the non-duplicate links (30 × 2 at most) together and strip the graphic cues, so that the user cannot tell which search engine each link came from.
e. For each link, record its source search engine and its original rank position.
f. Present the URLs in Microsoft Word files that allow the users to examine the actual web page by clicking on its hyperlink. Ask them to judge the relevance (topical as well as situational) of the pages on a 0-4 scale (0 stands for not relevant at all; 4 for very relevant).
g. Create an Excel or SPSS data file to enter the relevance scores.
h. Compare the performance of the search engines based on 1) Mean Average Precision and 2) CG and DCG.
i. Next, submit the same query to Scopus and Web of Science and conduct a domain analysis, in which you identify the publication trends and the major authors, institutions, journals, countries, and disciplines that have published in this area.
6. 
Digital library construction 
30% 
Each group will collaboratively build a functional online digital library using WordPress, Joomla, or the Greenstone Digital Library (GSDL) open-source content management system. Examples: DL_project_exampl1, DL_project_example2, DL_project_example3.
The project consists of three components: the implementation of a digital collection on a topic of your own choosing, a written report (5-6 pages), and an oral presentation of the project.
The digital collection should include:
a. A minimum of 70 documents representing different document formats, such as PDF, Word, and HTML.
b. An index structure that enables browsing of the collection.
c. Faceted and fielded search.
The written report (4-6 pages) should:
d. Explain the aim, purpose, and sources of the collection, as well as its intended users and their information needs. It is best to come up with an institutional context (real or imaginary) for the use of the collection.
e. Define your selection and indexing policies (human and machine indexing components; metadata structure) based on the aim and purpose stated above.
f. Include a graphic presentation of the browsable index structure and the rationale behind your design (i.e., explain why you chose certain browsable facets and searchable fields to represent your collection).
7. 
Final exam 
30% 
The exam is based on the lecture notes and readings. A review will be given before the exam to help you prepare for it.
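The comparison in steps c and h of the simulated literature search evaluation can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the course's prescribed method. All function names are hypothetical. Overlap is computed as a Jaccard ratio. A 0-4 judgment of 2 or higher is treated as relevant for average precision (the threshold is an assumption). Average precision is normalized by the number of relevant links retrieved, because the total number of relevant documents is unknown. DCG uses the common log2(rank + 1) discount, which is one of several variants in the literature.

```python
import math


def overlap(urls_a, urls_b):
    """Jaccard overlap: links returned by both engines / all distinct links."""
    a, b = set(urls_a), set(urls_b)
    return len(a & b) / len(a | b) if (a | b) else 0.0


def average_precision(scores, threshold=2):
    """Average precision for one ranked list of 0-4 relevance scores.

    Assumption: a score >= threshold counts as relevant, and the sum is
    divided by the number of relevant links retrieved.
    """
    hits, total = 0, 0.0
    for rank, score in enumerate(scores, start=1):
        if score >= threshold:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / hits if hits else 0.0


def mean_average_precision(score_lists, threshold=2):
    """Mean of the average precision over several search topics."""
    aps = [average_precision(s, threshold) for s in score_lists]
    return sum(aps) / len(aps) if aps else 0.0


def cumulative_gain(scores, k=None):
    """CG: plain sum of the graded scores in the top k ranks (all if k is None)."""
    return sum(scores[:k])


def discounted_cumulative_gain(scores, k=None):
    """DCG: each score is divided by log2(rank + 1), so rank 1 is undiscounted."""
    return sum(
        score / math.log2(rank + 1)
        for rank, score in enumerate(scores[:k], start=1)
    )
```

For example, the scores [4, 0, 2, 3] in rank order give CG = 9 and an average precision of about 0.81 at the assumed threshold. Because DCG discounts lower ranks, an engine that places its highly relevant links first scores higher than one with the same CG but a worse ordering.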
 
Course Schedule
Week
Date
Topic
Week 1
9/16  Introduction to syllabus
History of IR; data vs. information retrieval
Week 2
9/23  Advanced search with PubMed; introduction to search features in PubMed/Ovid/EBSCO/Embase
Discussion of your search demo project
Week 3
9/30, LAB  Search strategies and tactics; PICO
Camtasia demo (laptop)
Discussion of your search demo project
Week 4
10/07  Indexing exhaustivity vs. specificity
Automatic indexing basics (text analysis, term weighting)
Week 5
10/14  Search feature/command demo due
Week 6
10/21  Demo of ctext.org at the lab; TF*IDF tool
Demo of Corpro
Week 7
10/28  IR evaluation
Discussion of your second project (IR evaluation)
Week 8
11/04  IR models I: Boolean; term weighting and vector space model; similarity measures
Discussion of your IR evaluation project
Homework (automatic indexing) due
Week 9
11/11  Relevance feedback and query expansion
Discussion of your IR evaluation project
Week 10
11/18  Simulated search evaluation presentation
Week 11
11/25  Facet analysis and information architecture
WordPress demo at computer lab
Discussion of your DL project
Week 12
12/02  IR models II: probabilistic model
Discussion of your DL project
Week 13
12/09  IR models: probabilistic and language models
Discussion of your DL project
Week 14
12/16  Lab session with your DL project
Week 15
12/23  DL assignment presentation
Week 16
12/30  Web search and link structure
Week 17
1/06  Final review
Week 18
1/13  Final exam
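
The tokenization and TF*IDF homework, together with the Week 8 topics (term weighting, vector space model, similarity measures), can be illustrated with a short Python sketch. It is a simplified example under stated assumptions rather than the homework's required method. The function names are hypothetical. The tokenizer only handles lowercase ASCII words, so Chinese text would need a separate segmentation step. The weight is raw term frequency times ln(N / df), which is one common TF*IDF variant among many.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tf_idf_vectors(docs):
    """Return one {term: weight} dict per document.

    Assumed weighting: raw term frequency * ln(N / document frequency).
    """
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


docs = [
    "information retrieval systems",
    "information retrieval evaluation",
    "digital library systems",
]
vectors = tf_idf_vectors(docs)
# Similarity of the first document to each of the other two.
sims = [cosine(vectors[0], v) for v in vectors[1:]]
```

In this toy collection the first document is more similar to the second, with which it shares two terms, than to the third, with which it shares one. A term that occurs in every document receives a weight of zero under this variant, because ln(N / df) = ln(1) = 0.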